Deep learning: Requirements

Main topics

Graphics card (GPU)

Video source

Sample rate

Subject species, color and size

Number of subjects and arenas

Video image

Video length

Individual marking (for two-subject tracking)

Test apparatus and background

Recording protocol

Behavior Recognition

Test results

Graphics card (GPU)

Neural networks perform calculations on very large data matrices and therefore require substantial computing power. For the Deep learning tracking technique to work in EthoVision XT, you need a Graphics Processing Unit (GPU, or graphics card) that can sustain those computations.

Furthermore, Deep learning uses the TensorRT software development kit (version 8.6.1.6), which is built on the cuDNN deep neural network library, which in turn relies on the CUDA computing platform. For this reason, the GPU driver must support CUDA runtime version 13.1.
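One way to check whether a driver meets this requirement is to read the CUDA version that the NVIDIA nvidia-smi tool reports in its header. The following is a minimal sketch and not part of EthoVision XT: the function names are hypothetical, the sample header line is made up for illustration, and the pattern assumes the usual "CUDA Version: X.Y" header format.

```python
import re

REQUIRED_CUDA = (13, 1)  # CUDA runtime version stated in the requirement above


def parse_cuda_version(smi_output: str):
    """Extract the CUDA version from nvidia-smi header text as (major, minor).

    Returns None if no "CUDA Version: X.Y" field is found.
    """
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_supports_required_cuda(smi_output: str, required=REQUIRED_CUDA) -> bool:
    """Return True if the reported CUDA version is at least the required one."""
    version = parse_cuda_version(smi_output)
    return version is not None and version >= required


# Illustrative header line (made up for this example, not a real measurement).
sample_header = "| NVIDIA-SMI 000.00   Driver Version: 000.00   CUDA Version: 13.1 |"
print(driver_supports_required_cuda(sample_header))
```

On a PC with an NVIDIA driver installed, you can obtain the text to pass in by running nvidia-smi in a command prompt.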

If you intend to purchase a GPU for Deep learning, see the following topics for more information.

More information on graphics cards (GPUs)

Install a graphics card for Deep learning

Video source

Choose:

When tracking one subject per arena: Live tracking (in a maximum of four arenas), or From video file.

When tracking two subjects per arena: From video file.

Tracking two subjects live using Deep learning may lead to a high number of missing samples, depending on the power of the GPU. Always test your setup first. If you notice a significant number of missing samples (e.g. > 5%), do not track live. Instead, record video first, then acquire the trial later using Deep learning.

See the recommendations in Resolution, frame rate, and maximum trial duration
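The 5% guideline for missing samples can be checked with simple arithmetic on a test trial. The sketch below is illustrative only: the function names and the example numbers are hypothetical, and the threshold is the example value given above, not a fixed limit of the software.

```python
def missing_sample_percentage(expected_samples: int, acquired_samples: int) -> float:
    """Return the percentage of expected samples that were not acquired."""
    if expected_samples <= 0:
        raise ValueError("expected_samples must be positive")
    missing = max(expected_samples - acquired_samples, 0)
    return 100.0 * missing / expected_samples


def live_tracking_acceptable(expected_samples: int, acquired_samples: int,
                             threshold_percent: float = 5.0) -> bool:
    """Return True if the missing-sample percentage does not exceed the threshold."""
    return missing_sample_percentage(expected_samples, acquired_samples) <= threshold_percent


# Hypothetical 60-second test trial at 25 samples/s: 1500 samples expected.
expected = 60 * 25
print(missing_sample_percentage(expected, 1410))  # 90 of 1500 samples missing
print(live_tracking_acceptable(expected, 1410))
```

In this hypothetical test, 6% of the samples are missing, so the guideline suggests recording video first and tracking from the video file.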

Sample rate

The accuracy of individual discrimination may decrease when you reduce the sample rate, for example from 25 to 12.5 samples/s. We strongly recommend that you track at the maximum sample rate available, that is, the frame rate of your camera. Choose the sample rate in the Detection Settings. See Sample rate

Subject species, color and size

The neural network has been trained with images of rats and mice of uniform color.

For hooded rats, like Lister and Long-Evans rats:

When tracking one subject per arena: Select Hooded rats in the Detection settings. This loads a neural network model specific for those animals.

When tracking two subjects per arena: the neural network for two interacting subjects has not been trained for the fur patterns of hooded rats. If you want to use hooded rats, test pairs of animals before the actual experiments, to check that the software produces acceptable results.

The apparent length of the subjects should be at least 10% of the size of the arena. We recommend that the apparent length of the subject’s body (nose to tail-base) is at least 120 pixels for rats and 50 pixels for mice.
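You can estimate the apparent length before recording from the physical size of the arena and the number of pixels it spans in the video image. The sketch below is illustrative: the function names and example dimensions are hypothetical, and it assumes that "size of the arena" refers to the arena's linear dimension and that the camera views the arena without significant distortion.

```python
MIN_LENGTH_PX = {"rat": 120, "mouse": 50}  # recommended minimums from the text above


def apparent_length_px(body_length_cm: float, arena_size_cm: float,
                       arena_size_px: float) -> float:
    """Convert body length (nose to tail-base) from cm to pixels."""
    return body_length_cm * arena_size_px / arena_size_cm


def check_subject_size(species: str, body_length_cm: float,
                       arena_size_cm: float, arena_size_px: float) -> dict:
    """Compare the apparent length against both recommendations."""
    length_px = apparent_length_px(body_length_cm, arena_size_cm, arena_size_px)
    return {
        "length_px": length_px,
        "meets_pixel_minimum": length_px >= MIN_LENGTH_PX[species],
        "meets_arena_fraction": body_length_cm >= 0.10 * arena_size_cm,
    }


# Hypothetical example: an 8 cm mouse in a 50 cm arena that spans 400 pixels.
print(check_subject_size("mouse", 8, 50, 400))
```

If the pixel minimum is not met, zooming in so that the arena spans more pixels, or choosing a higher video resolution, increases the apparent length.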

See also Adjust the settings for nose-tail base detection (Deep learning)

Number of subjects and arenas

You can select Deep learning as body point detection technique in experiments with:

One subject per arena, in a maximum of four arenas.

Two subjects per arena, in a maximum of four arenas.

Important: Always draw an arena in the Arena Settings for optimal results.

Video image

When working with one arena at a time, choose a resolution of 640 x 480 or higher.

When working with two to four arenas, choose a resolution of 1280 x 960 or 1280 x 1024, or similar.

In any case, use video with a resolution equal to or higher than PAL/NTSC (PAL: 704 x 576; NTSC/EIA: 640 x 480). A higher resolution is not necessarily better: it makes tracking slower when tracking offline, and can cause missing samples when tracking live. First try a low resolution, and switch to a higher resolution if the results are not good. See also Subject species, color and size above.
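The resolution guidelines above can be summarized as a simple lookup by number of arenas. This is a sketch under stated assumptions, not a rule enforced by the software: the function name is hypothetical, and "or similar" is interpreted here as "at least 1280 x 960", which is only one possible reading.

```python
def meets_resolution_guideline(width: int, height: int, arenas: int) -> bool:
    """Check a video resolution against the guideline for the number of arenas."""
    if not 1 <= arenas <= 4:
        raise ValueError("Deep learning supports one to four arenas")
    # One arena: 640 x 480 or higher. Two to four arenas: about 1280 x 960.
    min_width, min_height = (640, 480) if arenas == 1 else (1280, 960)
    return width >= min_width and height >= min_height


print(meets_resolution_guideline(640, 480, 1))    # one arena, NTSC-size video
print(meets_resolution_guideline(704, 576, 3))    # three arenas, PAL-size video
print(meets_resolution_guideline(1280, 1024, 4))  # four arenas
```

A resolution that passes this check is a starting point only; the apparent size of the subject in pixels remains the deciding factor.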

For live tracking, not all video resolutions and frame rates are compatible with Deep learning. See Test results and then click on the camera type for configurations tested with Live tracking combined with Deep learning: GigE cameras, USB 3.0 cameras, and Analog cameras.

EthoVision XT converts video to grayscale before feeding it to the neural network. Therefore, both monochrome and color video work fine.

Video length

One subject per arena: no restrictions. Maximum trial duration tested: 72 hours. See Resolution, frame rate, and maximum trial duration

Two subjects per arena: we recommend trials of at least five minutes. Maximum trial duration tested: 1 hour.

During the trials, the two subjects should be separated for at least three minutes. If the subjects are in contact for most of the trial duration, individual recognition may fail. Also consider that, with two-subject tracking:

EthoVision XT saves additional files during acquisition, which increase the storage space needed on your PC. A 1-hour video produces 10 MB of additional files.

In long trials, the marker may change and droppings are likely to accumulate in the arena. Both factors can interfere with marker recognition.
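The two-subject guidelines above can be combined into a quick planning check. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the size of the additional files scales linearly with video length, whereas the text gives only the 1-hour figure.

```python
def two_subject_trial_check(duration_min: float, separated_min: float) -> dict:
    """Compare a planned two-subject trial with the guidelines above."""
    return {
        "long_enough": duration_min >= 5,              # at least five minutes
        "within_tested_duration": duration_min <= 60,  # tested up to 1 hour
        "enough_separation": separated_min >= 3,       # apart for three minutes
        # Assumption: 10 MB per hour of video, scaled linearly.
        "extra_storage_mb": 10 * duration_min / 60,
    }


# Hypothetical 30-minute trial in which the animals are apart for 4 minutes.
print(two_subject_trial_check(30, 4))
```

The separation time is usually only known after a pilot trial, so treat this as a check on pilot data rather than a prediction.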

Individual marking (for two-subject tracking)

This section refers to tracking two subjects per arena with Deep learning (up to four arenas). If you track one subject per arena, you do not need to mark the animals. For successful application of Deep learning-based discrimination between two individuals, the interacting animals must look different, or be marked.

Important: EthoVision XT does not detect differences in the subject’s body size. Also, toe clipping, ear punching or tagging won’t work. Differences in fur color and markings (also on the tail base) are essential for successful individual discrimination with EthoVision XT.

For information about non-invasive marking techniques, see Klabukov et al. 2023. Animals 13(22): 3452. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10668729/

Different fur color

If the two subjects are of different fur color, for example one white and the other brown/black, you do not have to mark them.

inset_5600081.jpg 

Tail marks

Tail marks only work if they are close enough to the abdomen of the subject. Mark only one subject, not the other. Make one ring, leaving some space between the ring and the abdomen:

inset_5700082.jpg 

In the following figure the tail mark is too far from the abdomen. Individual discrimination won’t work.

inset_5900083.jpg 

Advantages: The procedure is painless and easy.

Disadvantages: It may require weekly marking, which could cause stress. Furthermore, droppings left in the arena could be detected as tail marks and affect individual discrimination. This could especially occur in long trials.

Shaving marks

You can use a hair trimmer to shave a patch of fur on one of the subjects. The patch must be located along the spine of the subject. If you decide to shave both subjects, they must exhibit different patterns. Such marks generally last 1-4 weeks, depending on the stage of the hair cycle.

inset_5200084.jpg 

Advantages: The procedure is painless; grooming and manipulation do not remove shaving marks; it may be performed on rodents of all colors.

Disadvantages: The marking is temporary due to hair regrowth; you need to monitor the animals regularly to assess the condition of the mark. In long-term experiments, you need to renew the mark as hair regrows. Also, the mark may look too irregular, which reduces the identification rate.

Fur staining

White rodents can be marked with non-toxic dye or felt tip pens. When using infrared illumination, make sure that the marks are visible under infrared light. Mark only one of the subjects.

inset_5000085.jpg inset_5100086.jpg

Advantages: The procedure is easy and painless, and may be used with rodents of all ages.

Disadvantages: Dyes may fade over time or due to grooming; daily monitoring is needed to assess the condition of the mark; it can be used mainly on white-furred animals; there is a potential adverse response to solvents and odor.

Do not make V-shape markings or separate markings that converge on the back of the subjects, since they could interfere with the detection of the nose point, like in the following example:

inset_6000087.jpg 

Bleaching

The same remarks as for shaving marks apply here.

Advantages: The procedure is painless and bleach marks are not removed during grooming.

Disadvantages: It may be used on dark-furred rodents only. Furthermore, the software won’t work optimally if the part bleached is irregular and so large that it breaks the subject’s contour, like in the following example:

inset_5800088.jpg 

Important: Remove bleach solutions to avoid skin damage.

Color marking

Color marks only work when they differ in light intensity in grayscale, for example light pink and blue. In the example below, the different colors may look too similar in grayscale.

inset_5300089.jpg 

A safer option is to mark one subject, not the other (see above). For more details about color marking, see Tips for color tracking.

Marker location

Make sure that the marks are visible in most cases, including when the animal is sitting curled up. In the example below (right), the mark is not visible.

inset_6100090.jpg 

In the following example, the two marks on the rat’s shoulders are just too small (left). Combined with the effect of direct lighting (to be avoided in all cases), the marks disappear a few frames later (right). Individual recognition won’t work here.

inset_6400091.jpg 

Size of marks

The mark must be at least 4 x 4 pixels, after the arena has been rescaled to 256 x 256. This applies to tail marks and marks on the back of the animal.

If in doubt, do the following:

1. Open the video file at zoom level 1:1 (Original size).

2. Make a screenshot of the video window.

3. In a paint program, cut the image to include approximately only the arena.

4. Rescale the arena to 256 x 256 pixels. For rectangular arenas, rescale in such a way that the longest side is 256 pixels. Maintain the original aspect ratio.

5. After rescaling, the marker should be at least 4 x 4 pixels, as shown in the following image.

inset_4900092.jpg 
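If you know the size of the mark and of the arena in the original video image, you can also estimate the rescaled mark size by calculation instead of in a paint program. The sketch below follows the steps above; the function names and example dimensions are hypothetical, and the pixel measurements are assumed to come from the video at zoom level 1:1.

```python
def rescaled_mark_size(mark_w_px: float, mark_h_px: float,
                       arena_w_px: float, arena_h_px: float,
                       target_px: int = 256) -> tuple:
    """Scale the mark by the factor that makes the arena's longest side 256 px.

    The same factor is applied to both axes, so the aspect ratio is preserved.
    """
    scale = target_px / max(arena_w_px, arena_h_px)
    return mark_w_px * scale, mark_h_px * scale


def mark_is_large_enough(mark_w_px: float, mark_h_px: float,
                         arena_w_px: float, arena_h_px: float,
                         min_px: float = 4) -> bool:
    """Return True if the rescaled mark is at least 4 x 4 pixels."""
    width, height = rescaled_mark_size(mark_w_px, mark_h_px, arena_w_px, arena_h_px)
    return width >= min_px and height >= min_px


# Hypothetical examples: a 10 x 10 px mark in a 512 x 512 px arena,
# and a 12 x 12 px mark in a 1024 x 768 px arena.
print(rescaled_mark_size(10, 10, 512, 512))
print(mark_is_large_enough(12, 12, 1024, 768))
```

In the second example the arena is scaled down by a factor of four, so the 12 x 12 pixel mark becomes 3 x 3 pixels and falls below the minimum.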

Hooded rats

The neural network for two-subject tracking and identification has not been trained with the color patterns of hooded rats, so we cannot guarantee that it works when tracking two hooded rats. Do some tests before the actual experiments, to verify that it provides acceptable results.

Test apparatus and background

Test apparatus

EthoVision XT’s neural networks have been trained with videos of:

One subject per arena, subjects of uniform color: Open field (regular or with round objects in it), PhenoTyper (with or without bedding material), Elevated plus maze, Three-chamber social approach cage, Barnes maze, Fear conditioning cage with floor grid, Y-maze (with no objects), and a maze with multiple chambers and openings. See below the note about objects.

One subject per arena, hooded rats: Open field (regular or with round, rectangular or triangular objects in it), PhenoTyper (with or without bedding material), and Elevated plus maze.

Two subjects per arena: Open field (with no objects), PhenoTyper (with or without bedding material), and home cage.

Size of the arena

One subject per arena: No particular requirements, but see Video image above.

Two subjects per arena: Small containers reduce the probability that the two animals are separated for sufficient time. Sufficient separation is necessary for success with Deep learning-based individual recognition.

Background

The figure below shows an example of a good background where the rat’s apparent size is about 150 pixels.

inset_3100093.jpg 

If the animal is small relative to the arena, and there is no way to zoom in the camera image, try to improve the contrast with the background, for example by providing more infrared light, in order to compensate for the lower level of spatial detail.

Low contrast. In the following example, the mouse is large enough (about 80 pixels; see above) but the contrast with the background is too low. To solve this, increase the lighting, open up the lens aperture, or increase the camera gain. See Adjust camera settings in EthoVision XT

inset_3300094.jpg 

Floor grids are compatible with the Deep learning detection technique. If necessary, reduce the amount of light to minimize the reflections and shadows caused by the metal bars.

inset_3500095.jpg 

Bedding material can also be used with Deep learning-based tracking. However, there should be enough contrast with the animals. In this example with two mice in a PhenoTyper, the dark individual often went undetected.

inset_4700096.jpg 

Try increasing the lighting or, if possible, switch to another type of bedding.

Too much bedding/nest material can cause occlusions when the animal digs in it. In that case the subject detection and discrimination may not work properly.

inset_4600097.jpg 

The following are examples of setups that worked.

inset_5400098.jpg inset_5500099.jpg

Objects. For one-subject tracking, the apparatus can contain objects, like in the Novel Object Recognition test, or in the Sociability test. Whenever possible, use objects of a color different from the subject’s color.

inset_3600100.jpg 

Blind corners. When working with multiple arenas simultaneously, check that there are no blind corners, which may reduce detection rate.

inset_4800101.jpg 

Corridors and walls. For elevated plus mazes, radial mazes and other apparatuses with corridors, make sure that the walls are not of the same color as the subject.

In the following example, an excess of light from one side of the test room makes the top of the walls of this plus maze look white. A white mouse is still detected but when it touches the walls the nose-point and/or the tail-base point are no longer detected. Dim the lights and adjust their orientation to reduce the reflections.

inset_3200102.jpg 

When you define the Cutout size in the Detection Settings, try not to include walls and objects in the cutout box, especially if those objects are the same color as the animal. See Adjust the settings for nose-tail base detection (Deep learning)

Hole boards. Deep learning-based tracking also performs well in situations of low contrast between the subject and the holes, like in the following example.

inset_6500103.jpg 

Tip: Define the target hole as a hidden zone. See Shelters and other hidden zones

Apparatuses with backlight

One subject per arena: backlight is compatible but may not be ideal. In the following example, the mouse explores a plus maze with an infrared-lit background (left). When the mouse dips its head below the level of the open arms (right), the nose is not found. To solve this, place dim lights on the floor.

inset_3400104.jpg 

Two subjects per arena: backlight is not compatible.

Tethered animals

One subject per arena: When running experiments with tethered animals, the contour of the subjects seen by EthoVision XT is often disrupted by the fiber. The Deep learning technique may not be able to resolve the position of the nose. We did not test Deep learning with tethered animals. Therefore, run a few tests with sample videos to make sure that detection of the nose point is accurate enough. To improve detection of tethered subjects, use the Dilation and Erosion filter options. See an example in Advanced detection settings: Subject contour

Two subjects per arena: this setup has not been tested thoroughly, therefore applying Deep learning is not recommended.

Recording protocol

One subject per arena: No particular limitations.

Two subjects per arena: preferably, tracking should start when both animals are in the arena. The software also works when you release one animal first, then the other. However, for best results, ensure that the second animal is released within 30 seconds after the first.

With the Trial Control Settings, you can ensure that the data acquisition starts when both subjects are in the arena. See Start the trial in the Social interaction test

Behavior Recognition

Unfortunately, it is not yet possible to combine Deep learning for body point detection and Behavior Recognition for behavior classification in the same experiment. When you select Deep learning under Body Point Detection Technique in the Experiment Settings, the Behavior Recognition option is grayed out.

Test results

We tested Deep learning with various combinations of PCs, graphics cards and cameras. We recommend using a high-end workstation when working with Deep learning-based tracking. Click the link that applies to your camera type:

GigE cameras

USB 3.0 cameras

Analog cameras

In all the tests we used live tracking + save video, with one arena per camera. All tests were based on an open-field experiment with a black mouse and a clearly contrasting gray background.

See also

Cameras supported by EthoVision XT

System requirements > Hardware